On the Limiting Distribution of Lempel-Ziv'78 Redundancy for Memoryless Sources

Authors

  • Philippe Jacquet
  • Wojciech Szpankowski
Abstract

We study the Lempel-Ziv'78 algorithm and show that its (normalized) redundancy rate tends to a Gaussian distribution for memoryless sources. We accomplish this by extending findings from our 1995 paper [4], in particular by presenting a new, simplified proof of the Central Limit Theorem (CLT) for the number of phrases in the LZ'78 algorithm. As in [4], we first analyze the asymptotic behavior of the total path length in the associated digital search tree (DST) built from independent sequences. A renewal-theory-type argument then yields the CLT for the LZ'78 scheme. Here, we extend our analysis of the LZ'78 algorithm to present new results on the convergence of moments, moderate and large deviations, and the CLT for the (normalized) redundancy. In particular, we confirm that the average redundancy rate decays as $1/\log n$, and we find that the variance is of order $1/n$, where $n$ is the length of the text.
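The quantities named in the abstract (the LZ'78 phrases, their count, and the redundancy rate) can be illustrated with a minimal Python sketch. It is not taken from the paper. The function names are hypothetical, and the code-length convention (phrase k costs ceil(log2 k) pointer bits plus ceil(log2 |A|) symbol bits) is one common textbook assumption, not necessarily the one analyzed by the authors. The source is assumed to be binary and memoryless with a known probability p for one of the symbols.

```python
import math

def lz78_phrases(text):
    """Parse text into LZ'78 phrases: each phrase is the shortest prefix
    of the remaining text that has not yet appeared as a phrase."""
    trie = {}                      # nested dicts; one node per stored phrase
    phrases = []
    node, start = trie, 0
    for i, ch in enumerate(text):
        if ch in node:
            node = node[ch]        # still walking along a known phrase
        else:
            node[ch] = {}          # a new phrase ends at this symbol
            phrases.append(text[start:i + 1])
            node, start = trie, i + 1
    if start < len(text):
        phrases.append(text[start:])   # trailing phrase may repeat an earlier one
    return phrases

def code_length_bits(num_phrases, alphabet_size=2):
    """Assumed encoding: phrase k costs ceil(log2 k) bits for the pointer
    plus ceil(log2 alphabet_size) bits for the new symbol."""
    sym_bits = math.ceil(math.log2(alphabet_size))
    return sum(math.ceil(math.log2(k)) + sym_bits
               for k in range(1, num_phrases + 1))

def entropy_bits(p):
    """Entropy (bits per symbol) of a binary memoryless source."""
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def redundancy_rate(text, p):
    """(code length - n * entropy) / n for one sample text of length n."""
    n = len(text)
    m = len(lz78_phrases(text))
    return (code_length_bits(m) - n * entropy_bits(p)) / n

print(lz78_phrases("1011010100010"))
```

Averaging `redundancy_rate` over many random texts of growing length n is one way to explore empirically the $1/\log n$ decay of the mean stated above, within the limits of the assumed encoding.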

Similar resources

The Redundancy and Distribution of the Phrase Lengths of the Fixed-Database

The Fixed-Database version of the Lempel-Ziv algorithm closely resembles many versions that appear in practice. In this paper, we ascertain several key asymptotic properties of the algorithm as applied to sources with finite memory. First, we determine that for a dictionary of size $n$, the algorithm achieves a redundancy $\rho_n = H \frac{\log\log n}{\log n} + o\left(\frac{\log\log n}{\log n}\right)$, where $H$ is the entropy of the pro...

On the Average Redundancy Rate of the Lempel-Ziv Code

Wojciech Szpankowski, Department of Computer Science, Purdue University, W. Lafayette, IN 47907, U.S.A. It was conjectured that the average redundancy rate, $T_n$, for the Lempel-Ziv code (LZ78) is $O(\log\log n/\log n)$, where $n$ is the length of the database sequence. However, it was also known that for infinitely many $n$ the redundancy $T_n$ is bounded from below by $2/\log n$. In this paper we settle the a...

An implementable lossy version of the Lempel-Ziv algorithm - Part I: Optimality for memoryless sources

A new lossy variant of the Fixed-Database Lempel–Ziv coding algorithm for encoding at a fixed distortion level is proposed, and its asymptotic optimality and universality for memoryless sources (with respect to bounded single-letter distortion measures) is demonstrated: As the database size m increases to infinity, the expected compression ratio approaches the rate-distortion function. The comp...

An Implementable Lossy Version of the Lempel-Ziv Algorithm - Part I: Optimality for Memoryless

A new lossy variant of the Fixed-Database Lempel-Ziv coding algorithm for encoding at a fixed distortion level is proposed, and its asymptotic optimality and universality for memoryless sources (with respect to bounded single-letter distortion measures) is demonstrated: As the database size m increases to infinity, the expected compression ratio approaches the rate-distortion function. The compl...

A Simple Technique for Bounding the Pointwise Redundancy of the 1978 Lempel-Ziv Algorithm

Abstract: If $x$ is a string of finite length over a finite alphabet $A$, let $LZ(x)$ denote the length of the binary codeword assigned to $x$ by the 1978 version of the Lempel-Ziv data compression algorithm, let $t(x)$ be the number of phrases in the Lempel-Ziv parsing of $x$, and let $\mu(x)$ be the probability assigned to $x$ by a memoryless source model. Using a very simple technique, we prove the pointwise redun...

Journal title:
  • IEEE Trans. Information Theory

Volume 60, Issue

Pages -

Publication date: 2014